Metabolic profiling of milk thistle different organs using UPLC-TQD-MS/MS coupled to multivariate analysis in relation to their selective antiviral potential

Introduction: Silybum marianum, commonly known as milk thistle, is one of the most important medicinal plants owing to its remarkable pharmacological activities. Lately, the antiviral activities of S. marianum extract have been studied, and it has shown effectiveness against many viruses.
Objective: Although most previous studies were concerned mainly with the silymarin content of the fruit, the present study provides a comprehensive comparative evaluation of the chemical profiles of different S. marianum organs using UPLC-MS/MS coupled to chemometrics, to unravel potentially selective antiviral compounds against human coronavirus (HCoV-229E).
Methodology: UPLC-ESI-TQD-MS/MS analysis was used to establish metabolic fingerprints for S. marianum organs, namely fruits, leaves, roots and stems. Multivariate analysis using OPLS-DA and an HCA-heat map was applied to explore the main discriminatory phytoconstituents between organs. The selective virucidal activity of the organ extracts against coronavirus (HCoV-229E) was evaluated for the first time using the cytopathic effect (CPE) inhibition assay. Correlation coefficient analysis was implemented to detect potential constituents having virucidal activity.
Results: UPLC-MS/MS analysis resulted in 87 identified metabolites belonging to different classes. OPLS-DA revealed between-class discrimination of the milk thistle organs, demonstrating their significantly different metabolic profiles. The results of the CPE assay showed that all tested organ samples exhibited dose-dependent inhibitory activity in the nanomolar range. Correlation analysis disclosed that caffeic acid-O-hexoside, gadoleic acid and linolenic acid were the most potentially selective antiviral phytoconstituents.
Conclusion: This study valorizes the different S. marianum organs as rich sources of selective and effective antiviral candidates. This approach can be extended to unravel potentially active constituents from complex plant matrices.
Supplementary Information The online version contains supplementary material available at 10.1186/s12906-024-04411-7.


Introduction
El-Banna and Ibrahim, BMC Complementary Medicine and Therapies (2024) 24:115
Silybum marianum (L.) Gaertn. is an annual or biennial plant belonging to the family Asteraceae [1]. It has many common names, the most widely known being milk thistle [2]. It is native to the Mediterranean regions of Northern Africa, Southern Europe, and Western Asia, but it is now cultivated throughout the world [3] as a vegetable, medicinal, or ornamental plant [4].
Owing to its various beneficial effects, S. marianum is among the best-selling botanical dietary supplements worldwide, with average sales of about US$ 8 billion per annum [5]. Recently, the global milk thistle supplements market has expanded owing to the COVID-19 outbreak, which increased the need for immunomodulating supplements, in addition to the growing demand for effective anti-inflammatory, anti-aging and skin care natural products. In foods, its leaves and flowers are consumed as a vegetable in salads and as a substitute for spinach. Milk thistle seeds can be used raw or made into tea. They can also be roasted for use as a coffee substitute [6].
Lately, the antiviral activities of S. marianum extract have been studied, and it has shown effectiveness against many viruses, such as the flaviviruses (hepatitis C virus and dengue virus) [21,22], human immunodeficiency virus [23], togaviruses (Chikungunya virus and Mayaro virus) [24,25], hepatitis B virus [26] and influenza virus [27]. The remarkable antiviral efficacy of S. marianum extract is attributed to its multi-target activity on the host cell: it has shown the ability to modulate cell innate immunity [28,29], inflammation [30], oxidative stress [31] and autophagy [32], which are cellular processes impaired by viral invasion. In addition to modulating the cell environment, S. marianum extract has also shown the ability to exert direct, potent antiviral actions against viral proteins [33]. These findings encouraged researchers to assess the effectiveness of S. marianum against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of the COVID-19 pandemic. It was computationally found to act as an inhibitor of signal transducer and activator of transcription 3 (STAT3), the main modulator of the inflammatory and immune response. In addition, it was predicted to inhibit RNA-dependent RNA polymerase (RdRp), the main protein responsible for SARS-CoV-2 replication and transcription [34].
The first human coronavirus (HCoV) strain was discovered in 1965. Afterwards, an additional 30 strains were recognized, of which HCoV-229E was the prototypic strain on which HCoV research focused until 2002-2003, when severe acute respiratory syndrome coronavirus (SARS-CoV) emerged. Thereafter, the Middle East respiratory syndrome coronavirus (MERS-CoV) and the 2019 novel coronavirus (SARS-CoV-2) broke out [35,36]. Unlike SARS-CoV, MERS-CoV and SARS-CoV-2, which cause severe respiratory disease, HCoV-229E usually leads to mild to moderate upper respiratory tract illness, contributing to about 15-30% of human common cold cases [35]. HCoV-229E is an enveloped, single-stranded RNA virus. It is a member of the genus Alphacoronavirus and the subgenus Duvinacovirus [37].
S. marianum is rich in diverse secondary metabolites, including silymarin (a mixture of flavonolignans), phenolics, fatty acids and other chemical constituents [38]. The majority of previous studies focused only on the phytochemical and biological investigation of the flavonolignan constituents of S. marianum seeds and fruits [39-48], and, to the best of the authors' knowledge, no previous work has studied the whole metabolome and antiviral activity of all the different parts of milk thistle. Therefore, the present study aims to investigate, for the first time, the whole chemical profile of different S. marianum organs, including fruits, leaves, stems and roots, using UPLC-MS/MS and chemometric analysis, and to couple these data with the antiviral activity of these organs, with the aim of valorizing the unused milk thistle parts. Orthogonal projections to latent structures discriminant analysis (OPLS-DA) was performed to examine the class discrimination between the tested extracts and to reveal the chemical markers accountable for such discrimination. Afterwards, the antiviral potential of the tested extracts against HCoV-229E was determined on African green monkey kidney (Vero E6) cells using the cytopathic effect (CPE) inhibition assay. Thereafter, different chemometric models were constructed to identify the biological markers responsible for the bioactivity-based segregation of the studied extracts, in order to exploit them as potential sources of valuable antiviral agents.

Collection of the plant material
Five separate samples of the plant material were collected during the flowering-fruiting stage from farms belonging to the Faculty of Agriculture, Alexandria University, Egypt, in July 2022. The plant identity was confirmed by comparison with a herbarium specimen held in the herbarium of the Faculty of Science, Alexandria University, Egypt. A voucher specimen (SM2022) was deposited at the Department of Pharmacognosy, Faculty of Pharmacy, Alexandria University. The collected plant materials were allowed to dry at room temperature prior to phytochemical analysis.

Preparation of samples
Every plant sample was divided into four organs: fruits, leaves, roots, and stems. Each organ sample (100 g) was extracted individually by ultrasonication in 200 mL of 70% ethanol using an ultrasonic bath (28 kHz/1100 W) for 30 min at 45 °C, twice. The filtrates of each organ were collected and evaporated to dryness using a rotary evaporator under reduced pressure at 45 °C, giving a total of 20 samples (five plant samples × four organs).

Profiling the metabolome of S. marianum different organs extracts using UPLC-MS/MS
Samples preparation for UPLC-MS analysis
Methanolic solutions with a concentration of 1 mg/mL were prepared for each sample. These solutions were filtered through membrane filters (0.2 μm) and degassed by sonication before being analyzed by LC-MS. To ensure reproducibility, the above process was performed three times for every sample.

UPLC-ESI-TQD-MS analysis
Chromatographic parameters and conditions
The UPLC system consisted of a Waters Acquity QSM pump, an LC-2040 (Waters) autosampler, a degasser and a Waters Acquity CM detector. A 10 µL aliquot of each of the previously prepared samples (full loop injection volume) was separately injected into the chromatographic column three times. Chromatographic separation was conducted using a Waters Acquity UPLC BEH C18 column (50 mm × 2.1 mm ID × 1.7 μm particle size) operating at a flow rate of 0.2 mL/min and thermostated at 30 °C.

ESI-MS parameters and conditions
For LC/MS analysis, a triple quadrupole mass spectrometer was coupled to the UPLC instrument via an ESI interface. Ultra-high purity helium (He) was used as the collision gas and high purity nitrogen (N2) as the nebulizing gas. The mass spectrometer was operated in negative ionization mode over a mass range of m/z 50-1200. The optimized detection parameters were as follows: temperature 150 °C, cone voltage 30 V, capillary voltage 3 kV, desolvation temperature 440 °C, cone gas flow 50 L/h, and desolvation gas flow 900 L/h. A source fragmentation voltage of 25 V was applied. Negative ion mode was used to identify the molecular ions [M-H]−, followed by MS/MS product ion experiments to study the fragmentation patterns of the constituents. The run time of each analysis was 40 min. In the automatic MS/MS fragmentation process, precursor ions filtered by the first quadrupole (Q1) were fragmented by collision-induced dissociation (CID) with ultra-high purity helium in the second quadrupole (Q2). The third quadrupole mass analyzer (Q3) then filtered the daughter ions produced by CID, which reflect the molecular structure of the precursor ions. The collision energy for CID was optimized for each compound in order to acquire mass spectra with various degrees of fragmentation of the precursor ion, thus attaining as much structural information as possible. A data-dependent program was used for tandem mass spectrometry data acquisition. In this program, molecular ions detected in negative ion mode were selected for MS2 analysis, and the two most abundant fragment ions in the MS2 spectra were then selected for further MS3 fragmentation.

Annotation of UPLC-MS/MS compounds
The raw UPLC-MS data were pre-processed using MZmine® version 2.8 software, which was used for data import, chromatogram building, peak deconvolution, alignment and annotation. Tentative assignment of metabolites was established by comparing their retention times with those of standards (caffeic acid, malic acid, quercetin, coumarin, p-coumaryl alcohol, lanosterol, and linoleic acid, each used as the standard for its respective chemical class) and by interpreting tandem mass spectra (quasi-molecular ions as well as diagnostic MS/MS fragmentation profiles), combined with our comprehensive in-house database covering all compounds previously reported in the literature, including the Dictionary of Natural Products (https://dnp.chemnetbase.com/), PubChem and MassBank (https://massbank.eu/MassBank/), to provide a high confidence level of annotation [49,50].

Semi-quantitation of identified compounds using UPLC-MS/MS
The annotated compounds were semi-quantified according to their chemical class using standard compound solutions. Caffeic acid, malic acid, quercetin, coumarin, p-coumaryl alcohol, lanosterol, and linoleic acid were used as standards for their chemical classes and were procured from Sigma-Aldrich (St. Louis, MO, USA). Stock methanolic solutions, each with a concentration of 1 mg/mL, were prepared for every standard compound. These stock solutions were then diluted with HPLC-grade methanol to generate working concentrations ranging from 0.0125 to 0.625 mg/mL (Table 1). Each standard solution concentration was analyzed three times under the conditions described in the UPLC-ESI-TQD-MS analysis section. The standards were analyzed in the order shown in Table 1: caffeic acid, malic acid, quercetin, coumarin, p-coumaryl alcohol, lanosterol, then linoleic acid. The calibration curves were constructed by plotting the peak areas of the standards versus their concentrations. For each calibration curve, the equation is y = ax + b, where y is the peak area, x is the concentration of the standard (mg/mL), a is the slope, b is the intercept and r is the correlation coefficient.
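The calibration and semi-quantitation step can be illustrated with a short sketch. Only the linear model y = ax + b and the end points of the concentration range (0.0125 and 0.625 mg/mL) come from the text; the intermediate dilution levels, the peak areas and the function names are hypothetical values chosen for illustration, not data from this study.

```python
# Sketch of external-standard calibration by ordinary least squares.
# Peak areas and intermediate concentrations are HYPOTHETICAL illustration
# values (here generated as area = 100000 * conc + 50), not study data.

def fit_line(x, y):
    """Return (slope a, intercept b, correlation coefficient r) for y = a*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx                 # slope
    b = mean_y - a * mean_x       # intercept
    r = sxy / (sxx * syy) ** 0.5  # Pearson correlation coefficient
    return a, b, r


def concentration_from_area(area, a, b):
    """Invert the calibration line to estimate a concentration in mg/mL."""
    return (area - b) / a


conc = [0.0125, 0.025, 0.05, 0.125, 0.25, 0.625]             # mg/mL
area = [1300.0, 2550.0, 5050.0, 12550.0, 25050.0, 62550.0]   # hypothetical

a, b, r = fit_line(conc, area)
estimate = concentration_from_area(10050.0, a, b)  # peak area of an analyte
print(f"slope={a:.1f}, intercept={b:.1f}, r={r:.4f}, estimate={estimate:.4f} mg/mL")
```

An analyte's concentration is then read off the curve of the standard assigned to its chemical class, which is why the result is expressed as standard equivalents rather than as an absolute amount.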

Multivariate statistical analysis
Semi-quantitative analysis and biological activity testing results were statistically analyzed by one-way analysis of variance (ANOVA) using SPSS 26.0 (SPSS Inc., Chicago, IL, USA). MetaboAnalyst 4.0 (http://www.metaboanalyst.ca/), a web-based tool for processing metabolomics data, was used to construct hierarchical cluster analysis (HCA) heat maps.
In addition, SIMCA v 14 software (Umetrics, Sweden) was used to construct an Orthogonal Projections to Latent Structures-Discriminant Analysis (OPLS-DA) model, followed by an Orthogonal Projections to Latent Structures (OPLS) model, which enabled the discrimination of the different milk thistle organ extracts based on their chemical profiles as well as their antiviral activity. The OPLS-DA model enabled the identification of the phytoconstituents that generated such discrimination. Meanwhile, careful examination of the OPLS correlation coefficient plots enabled us to identify the metabolites strongly correlated with the investigated biological activity. Permutation plots were created to validate that the models were neither modelling noise nor over-fitted.

Selective virucidal activity of S. marianum different organs extracts against human coronavirus (HCoV-229E) using cytopathic effect (CPE) inhibition assay
The crystal violet method was used to evaluate antiviral and cytotoxic activities according to Schmidtke et al. (2001) [51]. In brief, Vero E6 cells (Nawah-Scientific, Egypt) were seeded into a 96-well plate at a density of 2 × 10⁴ cells/well one day before infection. Vero E6 cells were cultured in DMEM with 10% fetal bovine serum (FBS) and 0.1% antibiotic/antimycotic solution provided by Gibco BRL (Grand Island, NY, USA). After removing the culture medium the next day, the cells were washed with phosphate-buffered saline (PBS). The infectivity of coronavirus 229E (Nawah-Scientific, Egypt) was determined using the crystal violet method to monitor CPE and calculate the percentage of cell viability. A 0.1 mL aliquot of diluted 229E viral suspension at the CCID50 (50% cell culture infective dose of the virus stock) was added to the cells to attain the desired CPE after infection. For sample treatment, 0.01 mL of medium containing the desired extract was added to the cells. The antiviral activity of each test sample was estimated over a two-fold serial dilution range of 0.1-100 µg/mL. Virus controls (virus-infected, non-drug-treated cells) and cell controls (non-infected, non-drug-treated cells) were used. Culture plates were incubated for 3 days at 37 °C in 5% carbon dioxide. The development of CPE was monitored by light microscopy. Following a PBS wash, the cell monolayers were fixed and stained using a 0.03% crystal violet solution in 2% EtOH and 3% formalin. Following washing and drying, the optical densities (OD) of [...] Prior to conducting this assay, cytotoxicity on normal cells was assessed: cells were seeded at a density of 2 × 10⁴ cells/well in a 96-well plate. The next day, culture media containing the serially diluted extracts were added to the cells, which were incubated for 48 h; the media were then removed and the cells were washed with PBS. The subsequent steps were performed as described for the antiviral activity assay. GraphPad PRISM V 8 (San Diego, USA) software was used to determine the 50% cytotoxic concentrations (CC50) and 50% inhibitory concentrations (IC50).
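To make the meaning of a 50% concentration concrete, the sketch below estimates it by linear interpolation on log-concentration between the two doses that bracket a 50% response. This is a deliberately simplified stand-in, not the nonlinear regression performed in GraphPad Prism, and the dose-response values and function name are hypothetical; only the two-fold dilution scheme starting at 0.1 µg/mL follows the text.

```python
import math

# Simplified illustration of reading a 50% effect concentration from a
# dose-response series. This is NOT the curve fitting done by GraphPad Prism;
# it is log-linear interpolation, used here only to show the idea.

def interpolate_half_effect(concs, responses):
    """Concentration giving a 50% response, interpolated on log10(conc).

    `concs` must be in ascending order; `responses` are percentages.
    """
    points = list(zip(concs, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        crosses = (r_lo - 50.0) * (r_hi - 50.0) <= 0 and r_lo != r_hi
        if crosses:
            frac = (50.0 - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("responses never cross 50%")


# HYPOTHETICAL two-fold dilution series (µg/mL) and % inhibition of CPE.
concs = [0.1, 0.2, 0.4, 0.8, 1.6, 3.2]
inhibition = [5.0, 12.0, 30.0, 70.0, 90.0, 97.0]

ic50 = interpolate_half_effect(concs, inhibition)
print(f"IC50 is about {ic50:.3f} µg/mL")
```

The same function applied to percentage cytotoxicity in uninfected cells would give a CC50 estimate; a fitted sigmoidal model is preferable in practice because it uses all data points rather than only the two around the midpoint.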

Characterization of metabolites in S. marianum different organs
Metabolite profiling of the tested S. marianum extracts was accomplished using UPLC-MS/MS. Figure 1 [...] by comparison of their retention times with those of references and examination of their MS data (Table 2). The details of the characterization and fragmentation patterns of the identified metabolites are described below.
Moreover, the MS spectrum of compound 12 revealed an [M-H]− ion at m/z 353.3, in addition to characteristic daughter fragments at m/z 179.15 and 191.16 representing caffeic and quinic acids, respectively. Therefore, compound 12 was recognized as monocaffeoylquinic acid (chlorogenic acid) [56]. Quinic acid was also present as the free acid, as shown in the spectrum of compound 4, which had a base peak [M-H]− ion at m/z 191.16.
Compound 13, displaying a parent ion peak at m/z 275.28, was proposed to be ursinoic acid. This annotation was suggested by its MS/MS fragment ion at m/z 201.25 [M-H-44-30.03]−, owing to elimination of CO2 and a methoxy group [57].

Dicarboxylic acids
Two dicarboxylic acids were recognized. Malic acid (compound 14) was proposed for the parent ion at m/z 133.08, which was then fragmented to yield peaks at m/z 115.06 and 71.06 due to successive loss of H2O and CO2. Fumaric acid (compound 15), in turn, was proposed for the parent ion at m/z 115.06, which decarboxylated to give a fragment ion at m/z 71.06 [58].
These compounds, together with kaempferol (compound 46), exhibited fragment ions at m/z 151.1 and 133.12 due to the retro-Diels-Alder (RDA) reaction [85]. Compound 47 revealed a parent peak at m/z 287.24, which is 2 Da higher than that of kaempferol. Therefore, it was characterized as dihydrokaempferol (aromadendrin). It gave RDA fragments at m/z 151.1 and 135.14 [59].

Flavonoids
Similarly, rutin (compound 17) showed a distinguishing ion at m/z 301.23 [...] showed fragment ions at m/z 300.22 and 271.2 due to elimination of [CH3]− and [CH3 + CHO]−, respectively. They also showed the characteristic RDA fragment at m/z 151.1 [62]. Compounds 19, 22, and 24 exhibited their sugar moieties in the second RDA fragment, at m/z 487 for compound 19, m/z 339 for compound 22, and m/z 325 for compound 24, indicating that ring A of these compounds was free of sugar moieties.
In addition, naringin (compound 20), naringenin 7-O-hexoside (compound 27) and naringenin (compound 51) fragmented similarly, except that naringin and naringenin 7-O-hexoside had an extra 308.28 and 162.14 Da, corresponding to disaccharide and monosaccharide moieties, respectively. These compounds showed RDA fragments at m/z 151.1 and 119.14, and a fragment at m/z 107.09 corresponding to [151.1-CO2]− [63]. Compound 52, having an additional 2 Da relative to the molecular ion of naringenin, was suggested to be dihydronaringenin (phloretin). This was confirmed by its RDA fragments at m/z 151.1 and 121.16 [70]. Compound 67, with [M-H]− at m/z 407.48 and fragment ions at m/z 119.14 and 287.33, was proposed to be 6,8-diprenylnaringenin. It was identified by an extra 136 Da in its molecular ion and RDA fragment compared with those of naringenin [76]. [...] [75]. On the other hand, compound 53 was assigned as the methoxylated flavone tricin. It showed its molecular ion at m/z 329.28, which fragmented to generate ions at m/z 314.25, 299.22 and 271.21, corresponding to successive loss of two methyl groups and a carbonyl group [71].
Moreover, compounds 28, 31 and 45, displaying molecular ions at m/z 457.36, 441.37 and 305.26, were characterized as epigallocatechin gallate, catechin gallate and epigallocatechin, respectively. The characterization relied on the fragment ion at m/z 125.1, shared by the three compounds and corresponding to [C6H5O3]−, which originated from cleavage of two bonds in ring C and was composed of the phenolic ring A [64]. Catechin gallate was distinguished from epigallocatechin gallate and epigallocatechin by the presence of the fragment ion [C6H5O2]− at m/z 109.1, corresponding to ring B of catechin gallate. Epigallocatechin was differentiated from epigallocatechin gallate and catechin gallate by the absence of the fragment ion [C7H5O5]− at m/z 169.11, indicating the lack of a gallate moiety attached to 3-OH [64].
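The diagnostic-ion logic described for these three catechins can be written as a small decision rule. The sketch below is only an illustration of that reasoning: the function name, the matching tolerance and the example fragment lists are assumptions, while the three diagnostic m/z values are those quoted in the text.

```python
# Rule-based sketch of the diagnostic-ion reasoning for the three catechins.
# The tolerance and the example spectra are HYPOTHETICAL; the diagnostic
# m/z values (125.1, 109.1, 169.11) are the ones quoted in the text.

RING_A_ION = 125.1    # [C6H5O3]-, shared by all three compounds
RING_B_ION = 109.1    # [C6H5O2]-, ring B of catechin gallate
GALLATE_ION = 169.11  # [C7H5O5]-, gallate moiety


def classify_catechin(fragments, tol=0.05):
    """Return a tentative name from a list of MS/MS fragment m/z values."""
    def has(mz):
        return any(abs(f - mz) <= tol for f in fragments)

    if not has(RING_A_ION):
        return None  # shared ring A fragment missing: outside this rule set
    if not has(GALLATE_ION):
        return "epigallocatechin"  # no gallate moiety detected
    if has(RING_B_ION):
        return "catechin gallate"  # gallate plus ring B fragment
    return "epigallocatechin gallate"


print(classify_catechin([125.1, 169.11, 109.1]))  # catechin gallate
print(classify_catechin([125.1, 169.11]))         # epigallocatechin gallate
print(classify_catechin([125.1]))                 # epigallocatechin
```

In practice such a rule would be applied alongside the precursor m/z and retention time, since fragment ions alone do not rule out other compounds that share them.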
Additionally, chalcone (compound 56) was recognized by the parent peak at m/z 207.25 and the daughter ion peaks in the MS2 spectrum at m/z 130.14, corresponding to the fragment ion [C9H7O-H]− formed by loss of one phenolic ring, and at m/z 102.13, related to cleavage of the 1,2 bond [73]. Compound 55, with an additional 32 Da in the molecular ion and the same fragmentation behavior as chalcone, was recognized as 2',4'-dihydroxychalcone [72].
The most characteristic component of S. marianum is the silymarin mixture. It was identified in the mass spectra by seven compounds: silyamandin (compound 57), silychristin (compound 59), silydianin (compound 60), silybin A (compound 61), silybin B (compound 62), isosilybin A (compound 63) and isosilybin B (compound 64). Silybin A and B, isosilybin A and B, silydianin and silychristin all had their [M-H]− ions at m/z 481.43, while silyamandin exhibited its [M-H]− ion at m/z 497.43. All seven compounds had similar fragment ions at m/z 463, 453, 179 and 125. Silybin and isosilybin generated the following fragment ions in common: m/z 435, 301, 283, 273 and 257. However, silybin produced a fragment ion at m/z 423 that was not generated by isosilybin. Silydianin produced characteristic fragment ions at m/z 409, 151 and 301. Meanwhile, silychristin produced fragment ions at m/z 433, 423, 355, 337 and 325. Finally, silyamandin exhibited distinguishing peaks at m/z 480, 470, 375 and 355. The fragmentation patterns of these flavonolignans were similar to those described in the literature [74]. Two further compounds were identified: 2,3-dehydrosilybin (compound 58) and silandrin (compound 66). 2,3-Dehydrosilybin was suggested by its molecular ion at m/z 479.41, which is 2 Da lower than that of silybin, whereas silandrin (isosilybin; 3-deoxy) was proposed by its molecular ion at m/z 465.43, which is 16 Da lower than that of silybin.

Coumarins
Coumarin (compound 37) was proposed for the parent ion at m/z 145.14, which was further fragmented to yield an ion at m/z 117.13 due to loss of a carbonyl group [67]. Compound 36, with an extra 16 Da in both parent and daughter ions, was identified as 4-hydroxycoumarin [67]. Compound 39, with an additional 14 Da relative to compound 36, was suggested to be 4-methylumbelliferone. It fragmented in the same way as compounds 36 and 37, by loss of a carbonyl group, to give a fragment ion at m/z 147.15 [67]. Additionally, compound 42, giving a parent peak at m/z 217.2, was proposed to be 4-methylumbelliferyl acetate. It exhibited a daughter ion peak at m/z 175.16 [66]. Furthermore, compound 38, giving rise to a deprotonated ion at m/z 149.17, was annotated as p-coumaryl alcohol. Upon fragmentation, it generated a daughter ion peak at m/z 131.15 because of water loss. Similarly, compounds 40 and 41, having molecular ions at m/z 181.21 and fragment ions at m/z 163.19 [M-H-H2O]−, were identified as 2-hydroxymethyl-5-(2-hydroxypropan-2-yl) phenol and p-mentha-1,3,5-triene-2,7,8-triol, respectively [86].
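The neutral-loss reasoning used in these assignments amounts to simple mass arithmetic, which the sketch below reproduces for the parent and fragment m/z values quoted in the text. The loss table uses average molecular masses, which is an assumption made here because the reported m/z values appear to be average-mass based; the function name and the tolerance are likewise hypothetical.

```python
# Sketch of neutral-loss assignment from parent/fragment m/z pairs.
# The loss table (average masses) and the tolerance are ASSUMPTIONS for
# illustration; the m/z pairs are the values quoted in the text.

NEUTRAL_LOSSES = {"H2O": 18.015, "CO": 28.010, "CO2": 44.010}


def assign_loss(parent_mz, fragment_mz, tol=0.05):
    """Name the neutral loss matching parent - fragment, or None if no match."""
    delta = parent_mz - fragment_mz
    for name, mass in NEUTRAL_LOSSES.items():
        if abs(delta - mass) <= tol:
            return name
    return None


pairs = {
    "coumarin": (145.14, 117.13),
    "4-methylumbelliferone": (175.16, 147.15),
    "p-coumaryl alcohol": (149.17, 131.15),
    "compounds 40 and 41": (181.21, 163.19),
    "malic acid, first step": (133.08, 115.06),
    "malic acid, second step": (115.06, 71.06),
}

for label, (parent, fragment) in pairs.items():
    print(f"{label}: loss of {assign_loss(parent, fragment)}")
```

A pair that matches none of the listed losses, such as 217.2 to 175.16 for 4-methylumbelliferyl acetate, returns None and would need a larger loss table to be assigned.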
Moreover, the MS spectra of compounds [...] 58, respectively, together with their fragment ions related to neutral loss of H2O and CO2. These compounds were recognized as 12-oxo-phytodienoic acid, linolenic acid, linoleic acid, oleic acid, gadoleic acid, myristic acid, palmitic acid, stearic acid, arachidic acid and behenic acid, respectively [80].
Furthermore, two fatty acid methyl esters were recognized by their deprotonated ions at m/z 293.46 (compound 79) and 295.48 (compound 80). They yielded fragment ions in their MS2 spectra at m/z 262.43 (compound 79) and 264.45 (compound 80) because of methoxy group loss, and at m/z 221.4 corresponding to loss of the McLafferty ion, which is characteristic of methyl esters [82]. Compound 79 had a base peak at m/z 81.14 attributable to the hydrocarbon ion [C6H9]−, while compound 80 showed a base peak at m/z 55.1 attributable to the hydrocarbon ion [C4H7]− [83]. Accordingly, these compounds were proposed to be linoleic acid methyl ester (compound 79) and 16-octadecenoic acid methyl ester (compound 80).
Finally, (R)-gamma-tocotrienol (compound 87) was identified by its parent ion at m/z 409.63 [...]
Thereafter, all identified compounds in the S. marianum samples were subjected to relative quantitation via the calibration curves described in the Semi-quantitation of identified compounds using UPLC-MS/MS section, and their relative contents are presented in Fig. 2 and Table S1.

Unsupervised HCA-heat map for chemical profiling of S. marianum different organs
In this section, comparative chemical profiling of S. marianum fruits, leaves, roots and stems was attempted using UPLC-tandem mass analysis combined with multivariate statistical analysis. The semi-quantitative data of the compounds characterized in the previous section (Table S1) were used to conduct an unsupervised dendritic analysis of the extracts under investigation. As shown in Figs. 1, 2 and 3, considerable variation in the chemical profiles of the different milk thistle organs was observed. The hierarchical clustering analysis (HCA)-heat map (Fig. 3) grouped the different S. marianum organ extracts into three separate clusters: the first comprised the five fruit samples and the second the root samples, while the third was split into two subclusters, namely the leaf and stem samples, indicating relative proximity in their chemical composition. This supports the previous findings of Javeed et al., who calculated the total phenolic and flavonoid contents of different milk thistle parts and found that the leaf and stem extracts were enriched with higher amounts of them [87].
S. marianum fruit samples possessed the highest relative content of flavonolignans such as 2,3-dehydrosilybin, silymin A, silydianin, isosilybin A, and silybin A and B, a finding in good agreement with that reported by Korany et al. [39]. Other metabolic classes were diversely distributed among the different organ clusters and subclusters, as indicated by the dark red color code (Fig. 3). Four compounds were found in all the studied milk thistle organs: the phenolic acid cinnamic acid, which showed the highest accumulation in stems; the two flavonoid aglycones naringenin and tricin, which were both highly accumulated in fruits followed by seeds; and the fatty acid ester 16-octadecenoic acid methyl ester, which was also detected in greater amounts in milk thistle fruits. Meanwhile, close observation of Fig. 3 revealed that the two coumarins 4-methylumbelliferone and 4-hydroxycoumarin, in addition to 6,8-diprenylnaringenin, 2-hydroxymethyl-5-(2-hydroxypropan-2-yl) phenol and the two flavonolignans silybin A and B, were the main constituents of the fruit samples. In contrast, milk thistle leaf samples exhibited greater accumulation of the phenolic acid glycoside caffeic acid-O-hexoside and of quinic acid. Further, isorhamnetin, gallic acid hexoside, bergenin, caffeic acid-O-hexoside, and apigenin were the major compounds found in root samples. Finally, cinnamic and syringic acids, genistein, apigenin-7-O-hexoside, taxifolin and phloretin were the main constituents detected in milk thistle stems.
It is worth mentioning that, to the best of the authors' knowledge, this is the first comprehensive evaluation of the chemical profiles of the different S. marianum organs.

OPLS-DA for supervised multivariate discrimination between different organs
For inter- and intra-class discrimination of the fruit, leaf, root and stem samples, an OPLS-DA multivariate model was created using the MS data obtained from LC-MS/MS analysis (Fig. 4A and B). Moreover, OPLS-DA was able to unravel the discriminatory markers characteristic of each class's chemical profile via separate coefficient plots for each organ (Fig. 4C-F). The first and second latent variables of the constructed model accounted for 46.3% and 30.1% of the variability, respectively. Moreover, the model exhibited high reliability and predictive ability, represented by a high goodness of fit (R2 = 0.998) and goodness of prediction (Q2 = 0.996). To validate the OPLS-DA model, permutation plots for fruits, leaves, roots and stems (Fig. S1) were constructed using 20 permutations for each class. The blue regression line of the Q2 points intersected the vertical axis below zero, while the green R2 values to the left were lower than the original point to the right, which strongly indicated the validity of the model. ROC curves (Fig. S2) were constructed, and the AUC was found to be equal to one for all classes, indicating the excellent classification power of the model.
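For reference, the goodness-of-fit and goodness-of-prediction statistics quoted above are conventionally defined as shown below. These are the usual textbook definitions rather than formulas stated in this study, and the symbols are notation assumed here: y_i is the observed response, \hat{y}_i the fitted value, \hat{y}_{(i)} the cross-validated prediction obtained with sample i left out of model building, and \bar{y} the mean response.

```latex
R^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2},
\qquad
Q^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_{(i)}\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}
```

Under these definitions, values close to 1 indicate that the model both fits and predicts the class membership well, and a permutation test checks that comparable values are not obtained when the class labels are shuffled.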
Between-class discrimination along the first latent variable (LV1) was observed in the 2D score scatter plot (Fig. 4A), where all the fruit samples were successfully grouped along its positive side, while the other organ samples were on the negative side of LV1. The second latent variable (LV2), in turn, successfully separated the root samples on its negative side from the leaf and stem samples on its positive side. This classification was in agreement with that observed in the OPLS-DA dendrogram (Fig. 4B), which revealed two principal clusters: one for the fruit samples and another that was further divided into a subcluster for roots and a subcluster comprising the stem and leaf samples. The OPLS-DA coefficient plots (Fig. 4C-F) allowed the recognition of the phytoconstituents responsible for the segregation of each milk thistle organ into a separate class. Caffeic acid, naringenin 7-O-hexoside, silydianin, silybin B, isosilybin A and silybin A were the main differentiating markers positively correlated with fruit samples (Fig. 4C). Meanwhile, daidzein-7-O-hexoside, silandrin, linolenic acid, 1,3-tridecadiene-5,7,9,11-tetrayne 1,2-epoxide, kaempferol 3,7-dihexoside, isorhamnetin-3-O-hexuronide and isosilybin B were the foremost constituents related to the leaf class (Fig. 4D). In contrast, the flavonoid aglycone isorhamnetin, coumaroyl hexoside, behenic acid, 12-tridecene-4,6,8,10-tetraynal, p-coumaric acid and ononin were the compounds positively related to the root class (Fig. 4E). Moreover, the flavonoids genistein, dihydrokaempferol and apigenin-7-O-hexoside, the phenolic acids cinnamic, syringic and chlorogenic acid, silychristin, phloretin and linoleic acid were the principal differentiating markers showing positive correlation with the stem class (Fig. 4F).
Fig. 2 Relative quantitation of the total phenolics, dicarboxylic acids, flavonoids, coumarins, alcohols, triterpenes, and fatty acids in different organs of S. marianum expressed as mg equivalents (Eq.)/g dry weight
Fig. 3 Hierarchical analysis heat maps of all identified metabolites in fruit, leaves, roots and stems of S. marianum. Brick red and blue indicate higher and lower abundances, respectively

Selective virucidal activity of the tested S. marianum organs extracts against human coronavirus (HCoV-229E) using cytopathic effect (CPE) inhibition assay
The emergence of respiratory viral strains resistant to currently used antivirals such as oseltamivir, zanamivir, peramivir, and laninamivir [88] makes the development of natural selective alternatives with diminished toxicity urgently required. In this context, the selective virucidal activity of the tested milk thistle organ extracts against human coronavirus (HCoV-229E) was evaluated for the first time using the cytopathic effect (CPE) inhibition assay. The dose-response assay was designed to determine the range of efficacy of each sample, i.e. the 50% inhibitory concentration (IC50), as well as the range of cytotoxicity (CC50). This assay is a critical and well-established tool for assessing the efficacy of synthetic and natural agents against many viruses such as metapneumoviruses [89], influenza viruses [90], enteroviruses [91], and herpes simplex virus [92], among others. The selectivity index (SI = cytotoxicity/bioactivity) is an indispensable parameter to evaluate when exploring novel antiviral candidates, rather than focusing only on pharmacological or toxicological parameters separately [93]. As revealed in Fig. 5, all the tested milk thistle organ samples exhibited dose-dependent inhibitory activity on HCoV-229E in the ng/mL range. The results were compared to the positive control (remdesivir) (Table 3). Comparison of the IC50 values showed that the fruit samples had the smallest IC50 among all tested organs (667.6 ± 0.5 ng/mL), indicating the highest activity against HCoV-229E, while the leaves had the largest IC50 (2151 ± 0.9 ng/mL). Conversely, a low 50% cytotoxic concentration (CC50) on Vero E6 cells indicates high toxicity of the tested sample to normal cells. Milk thistle fruits had the lowest CC50 (3195 ± 0.3 ng/mL), indicating the highest cytotoxicity among the samples (Fig. 5). Meanwhile, the leaves recorded the lowest toxicity, with a CC50 of 14,598 ± 1.2 ng/mL. The selectivity index (SI) was then calculated by dividing cytotoxicity, expressed as pCC50 (-log CC50 in g/L), by HCoV-229E antiviral activity, expressed as pIC50 (-log IC50 in g/L), to identify the samples that are highly selective for virus-infected cells without causing toxicity to normal cells (Table 3). Under this definition, the lower the selectivity index, the more selective the tested sample. The leaves samples, whose pCC50 was low relative to their pIC50 and which therefore yielded a low selectivity index, are promising anti-human coronavirus 229E drug-like candidates. Although many other researchers have documented the antiviral efficacy of silymarin and milk thistle supplements [33, 94-96], the present study is the first to compare the antiviral efficacy of different milk thistle organs with the aim of valorizing the unused plant parts.
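The selectivity index calculation described above can be sketched in a few lines. This is a minimal illustration, not the authors' script: the function names are hypothetical, the unit conversion assumes 1 ng/mL = 1e-6 g/L, and only the mean IC50 and CC50 values quoted in the text for fruits and leaves are used (the error terms and the remaining organs in Table 3 are not reproduced).

```python
import math

# Assumed unit conversion: 1 ng/mL = 1e-9 g per 1e-3 L = 1e-6 g/L.
NG_PER_ML_TO_G_PER_L = 1e-6


def p_value(conc_ng_per_ml):
    """Return -log10 of a concentration after converting ng/mL to g/L."""
    return -math.log10(conc_ng_per_ml * NG_PER_ML_TO_G_PER_L)


def selectivity_index(ic50_ng_per_ml, cc50_ng_per_ml):
    """SI as defined in the text: pCC50 / pIC50 (lower means more selective)."""
    return p_value(cc50_ng_per_ml) / p_value(ic50_ng_per_ml)


# Mean (IC50, CC50) values in ng/mL quoted in the text for two organs.
reported = {
    "fruits": (667.6, 3195.0),
    "leaves": (2151.0, 14598.0),
}

si = {
    organ: selectivity_index(ic50, cc50)
    for organ, (ic50, cc50) in reported.items()
}

for organ, value in si.items():
    print(f"{organ}: SI = {value:.3f}")
```

With these inputs the ratio comes out lower for leaves than for fruits, which matches the ranking stated in the text. The exact figures may differ from Table 3 if the authors used different rounding or replicate-level values.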

Correlation analysis to selective antiviral activity for unraveling bioactive phytoconstituents from the tested S. marianum organs samples
An OPLS model and its accompanying correlation coefficient analysis were used to detect the phytoconstituents significantly associated with selective virucidal activity against human coronavirus (HCoV-229E) among the four milk thistle organ samples studied, and to evaluate the resulting classification of the samples based on bioactivity. The biplot of the constructed OPLS model (Fig. 6A) showed between-class discrimination of fruits and stems from roots and leaves samples: the former were spatially related to cytotoxicity (pCC50) and antiviral activity on HCoV-229E (pIC50), while the latter were in proximity to pSI, indicating better selectivity. Further, the coefficient plots (Fig. 6B-D) showed that 16-octadecenoic acid methyl ester, taxifolin, cinnamic and chlorogenic acids had the highest positive correlation with HCoV-229E inhibitory activity (Fig. 6B), while 16-octadecenoic acid methyl ester, taxifolin, tricin and naringenin were the major metabolites positively related to cytotoxic activity on normal cells (Fig. 6C). Finally, Fig. 6D indicated that caffeic acid-O-hexoside, gadoleic and linolenic acids, daidzein-7-O-hexoside, apigenin-7-O-hexoside ethyl ester and coumarin were the most potentially selective anti-human coronavirus 229E phytoconstituents.
These findings are consistent with previous research showing that octadecenoic acid derivatives can bind to several coronavirus proteins, such as RNA-dependent RNA polymerase, main protease, and spike protein S1, to degrees similar to those of the known antiviral drug umifenovir [97]. They also exhibited activity against influenza A and B viruses [98]. In addition, taxifolin inhibited the replication of HCoV-229E in Huh-7 cells at 2.5 µM, and this inhibitory activity increased with concentration. This activity was explained by its ability to inhibit the viral main protease [99]. Moreover, chlorogenic, caffeic and linolenic acids, and daidzein were found to inhibit HCoV S-glycoprotein attachment to host cells. This was attributed to their ability to impair the function of HSPA5 SBDβ, which is the binding site for the viral S-glycoprotein [100]. Furthermore, tricin was found to have antiviral activity against influenza A and B strains by inhibiting viral mRNA synthesis [101]. Besides, naringenin was found to inhibit the cytopathic effect in Vero E6 cells infected with SARS-CoV-2 in a time- and concentration-dependent manner. This effect was explained by its ability to inhibit endo-lysosomal Two-Pore Channels (TPCs), a pathway facilitating viral entry into the host cell [101]. Further, apigenin and coumarins were found to be SARS-CoV-2 main protease inhibitors, thus inhibiting viral replication in the host cell [102, 103].
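The idea behind the correlation step at the start of this section, ranking metabolites by how strongly their abundance tracks a bioactivity response, can be illustrated with a plain Pearson correlation. This is a deliberately simplified stand-in and not the OPLS coefficient analysis used in the study; the metabolite names, abundances and response values below are made-up placeholders chosen only to make the ranking visible.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical response values for four samples (NOT data from the study).
response = [1.0, 2.0, 3.0, 4.0]

# Hypothetical relative abundances of three placeholder metabolites.
abundances = {
    "metabolite_a": [2.0, 4.1, 5.9, 8.0],  # rises with the response
    "metabolite_b": [8.0, 6.1, 3.9, 2.0],  # falls with the response
    "metabolite_c": [5.0, 5.2, 4.9, 5.1],  # roughly flat
}

correlations = {
    name: pearson_r(values, response) for name, values in abundances.items()
}

# Rank from most positively to most negatively correlated.
ranked = sorted(correlations.items(), key=lambda item: item[1], reverse=True)

for name, r in ranked:
    print(f"{name}: r = {r:+.3f}")
```

In an OPLS model the ranking comes from regression coefficients on the predictive component rather than from pairwise correlations, so this sketch conveys only the interpretation of the coefficient plots, not their computation.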

Conclusion
This study provides the first comparative evaluation of the metabolomes of different S. marianum organs using UPLC-MS/MS coupled with multivariate analysis. The HCA heat map and OPLS-DA revealed between-class discrimination between fruits and the other organ samples, and further separated the root samples from the leaves and stem samples. The OPLS-DA coefficient plots allowed recognition of the phytoconstituents responsible for segregating each organ into a separate class. All studied S. marianum organ extracts were tested for selective virucidal activity against human coronavirus (HCoV-229E), and all exhibited dose-dependent inhibitory activity in the ng/mL range with variable degrees of safety, efficacy, and selectivity. An OPLS model and its accompanying correlation coefficient analysis were used to detect the phytoconstituents with effective, safe, and selective antiviral potential among the four studied S. marianum organs. The present study valorizes different S. marianum organs as rich sources of valuable antiviral agents. Future work will involve isolating the promising antiviral phytoconstituents identified here from different milk thistle organs, followed by extensive in vitro and in vivo testing of their biological activities, to support more conclusive and comprehensive therapeutic approaches that could bring these candidates to the market.

Kaempferol 3,7-dihexoside (compound 16) generated RDA fragments at m/z 313 and 295, indicating that both rings A and B of the flavonoidal structure contained a hexose moiety. It also demonstrated a characteristic ion at m/z 285.23 due to [C12H18O10]− loss. Rhamnocitrin-O-hexoside (compound 29) produced the same fragment ion because of successive loss of methyl and hexose units.

Fig. 4 OPLS-DA scatter plot (t1 scores vs. t2 scores) (A), Dendrogram derived from the hierarchical cluster analysis, based on the Ward method of fruits, leaves, roots and stems samples of S. marianum (B), Coefficient plots of the OPLS-DA model of S. marianum fruits (C), leaves (D), roots (E), and stems (F) to determine discriminative metabolites

Fig. 6 Orthogonal Projections to Latent Structures (OPLS) biplot of the tested samples in correlation to the bioactive markers (A). Coefficient plots of the OPLS model used to determine the biomarkers responsible for antiviral activity (pIC50) (B), cytotoxicity (pCC50) (C), and selectivity (pSI) (D)